.. _tutorial_imagemosaic_update_to_s3:

COG ImageMosaic from local storage to S3
========================================

Introduction
------------

This tutorial explains how to convert an existing ImageMosaic built on local granules into a COG ImageMosaic whose granules are stored in an S3 bucket.
It is aimed at users who want to move the COG granules of an ImageMosaic to a remote bucket without re-harvesting the whole collection of granules.

Assumptions
-----------
* An ImageMosaic store already exists, with its index stored in a database (e.g. PostGIS).
* The local GeoTIFF granules are already valid COGs.
* The user has experience uploading data to S3.

Verifying data is valid COG
"""""""""""""""""""""""""""
You can verify that a sample GeoTIFF is a valid COG using the COG Validator service.

#. Upload a sample GeoTIFF to the target bucket (or to the server location) you will use for remote serving and copy its full URL, e.g. `<https://modis-vi-nasa.s3-us-west-2.amazonaws.com/MOD13A1.006/2018.01.01.tif>`_.
#. Go to the `COG Validator <http://cog-validate.radiant.earth/html>`_.
#. Paste the sample COG URL in the text box and hit the submit button.
#. If the sample file is a valid COG, you will get a message like this:

``Cloud Optimized GeoTIFF Validator: result Validation succeeded ! 
https://sample.s3.eu-central-1.amazonaws.com/test/cog.tif 
is a valid Cloud Optimized GeoTIFF.``

If the file is not a valid COG, you can use GDAL 3.1 or later to convert it to COG format.
See the related `GDAL documentation <https://gdal.org/drivers/raster/cog.html>`_ for further details.

Once the data has been verified, upload all of your granules to the S3 bucket.

ImageMosaic update
------------------
The next step is to update both the ImageMosaic configuration and its index.

ImageMosaic configuration update
""""""""""""""""""""""""""""""""
A few new properties need to be added to the ImageMosaic configuration to support COG.

Locate the ``.properties`` file containing the mosaic configuration. It usually has the same name as its parent folder.
You can recognize it because it is usually autogenerated when the ImageMosaic is first configured, and it contains this header:

``#-Automagically created from GeoTools-``

For simplicity, the rest of this tutorial assumes the file is named :file:`mosaic.properties`.
Once you have located it, edit the file and add these new properties:

* ``Cog=true``
* ``SuggestedSPI=it.geosolutions.imageioimpl.plugins.cog.CogImageReaderSpi``

If your granules are stored in a public bucket, you can keep the default RangeReader implementation: no other properties are needed and you can skip to the ImageMosaic index update paragraph.

If you are using a private bucket instead, add these properties to the :file:`mosaic.properties` file:

* ``CogRangeReader=it.geosolutions.imageioimpl.plugins.cog.S3RangeReader``
* ``CogUser=S3AccessKeyID``
* ``CogPassword=S3SecretAccessKey``

Here, ``S3AccessKeyID`` and ``S3SecretAccessKey`` stand for the actual credentials needed to access that bucket.

ImageMosaic index update
""""""""""""""""""""""""
The next step is to update the ImageMosaic index, which is a catalog of all the granules composing the mosaic.
The location values must be updated to refer to remote URLs instead of local paths on disk.
The location attribute initially contains the path of each granule on disk, which can be either relative or absolute.
Relative paths are resolved against the ImageMosaic parent configuration folder, whilst absolute paths are full file system paths.

The :file:`mosaic.properties` file contains a ``PathType`` property set to ``RELATIVE`` or ``ABSOLUTE``.
On old mosaics, that property might be missing, and an ``AbsolutePath`` property with a boolean value (``true`` or ``false``) is present instead.
Since this setting applies to the whole mosaic, all the paths of a given mosaic are either relative or absolute.

For example, an ImageMosaic stored at :file:`/var/data/imageMosaic/mosaic` with a granule at :file:`/var/data/imageMosaic/mosaic/2018.01.01.tif`
may have a record in the database with a location attribute equal to:

* ``2018.01.01.tif`` in case of a relative path
* ``/var/data/imageMosaic/mosaic/2018.01.01.tif`` in case of an absolute path

The type of path affects the query to be executed to update the index.
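
If you are unsure which kind of path your index stores, you can inspect a few records directly. The query below is only a sketch: ``schema.table`` is a placeholder to be replaced with the actual schema and table name of your mosaic index.

.. code-block:: sql

    -- Peek at a few stored locations to see whether they are relative or absolute.
    -- schema.table is a placeholder for your actual index table.
    SELECT location
    FROM schema.table
    LIMIT 5;

Values starting with a file system root (such as ``/var/data/...``) indicate absolute paths, while bare file names or sub-paths indicate relative ones.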

.. note:: Make sure to back up the index table first, so that you can recover quickly if the update query goes wrong.
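
One possible way to take such a backup, assuming a PostgreSQL/PostGIS index, is to copy the table before modifying it. Both names are placeholders: ``schema.table`` stands for your index table and ``schema.table_backup`` is a hypothetical name for the copy.

.. code-block:: sql

    -- Copy all columns and rows of the index table before touching it.
    -- schema.table_backup is a hypothetical name; pick one that does not exist yet.
    CREATE TABLE schema.table_backup AS
    SELECT * FROM schema.table;

Note that ``CREATE TABLE ... AS`` copies the data only, not indexes or constraints, so the copy is meant as a source for restoring the original values rather than as a drop-in replacement for the index table.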

This example uses the same public dataset S3 URLs used in the previous :ref:`tutorial_imagemosaic_cog_landsat8` section.


For locations with relative paths, a simple update query could look like this:

.. code-block:: sql
    
    UPDATE schema.table SET location=CONCAT(
    'https://modis-vi-nasa.s3-us-west-2.amazonaws.com/MOD13A1.006/', location);

This prepends the S3 bucket URL to the location value.
Based on the example above, ``location='2018.01.01.tif'`` becomes ``location='https://modis-vi-nasa.s3-us-west-2.amazonaws.com/MOD13A1.006/2018.01.01.tif'``.
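
Before actually running the relative-path update, it may help to preview its effect. The following sketch, using the same ``schema.table`` placeholder, only selects the current value next to the value the update would write, without modifying anything:

.. code-block:: sql

    -- Dry run: compare the stored location with the URL it would become.
    SELECT location AS current_location,
           CONCAT('https://modis-vi-nasa.s3-us-west-2.amazonaws.com/MOD13A1.006/',
                  location) AS new_location
    FROM schema.table
    LIMIT 10;

If the ``new_location`` values are the URLs you expect for your granules, the update query can be run as shown.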


For locations with absolute paths, the update query for our example could look like this:

.. code-block:: sql
    
    UPDATE schema.table SET location=REPLACE(location,'/var/data/imageMosaic/mosaic/',
    'https://modis-vi-nasa.s3-us-west-2.amazonaws.com/MOD13A1.006/');
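
After either update, a quick sanity check is to count the records whose location does not yet point to the bucket. This is a sketch, assuming that every granule of the mosaic was moved under the same URL prefix and using the same ``schema.table`` placeholder:

.. code-block:: sql

    -- Count the records that were not rewritten to the bucket URL prefix.
    SELECT COUNT(*) AS not_migrated
    FROM schema.table
    WHERE location NOT LIKE 'https://modis-vi-nasa.s3-us-west-2.amazonaws.com/MOD13A1.006/%';

Under that assumption, the count should be zero once all the locations have been updated. Any remaining record is worth inspecting before reloading GeoServer.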

GeoServer reload
----------------
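Once both the configuration and the index have been updated, reload the GeoServer configuration (for example, from the Server Status page of the web administration interface).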
